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THE LYAPUNOV MATRIX EQUATION. MATRIX ANALYSIS FROM 
A COMPUTATIONAL PERSPECTIVE* 

V. SIMONCINlt 

Abstract. Decay properties of the solution X to the Lyapunov matrix equation AA + AA T = D 
are investigated. Their exploitation in the understanding of equation matrix properties, and in the 
development of new numerical solution strategies when D is not low rank but possibly sparse is also 
briefly discussed. 
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1. Introduction. We are interested in the analysis of the linear matrix equation 

AX + XA 7 = D, A,DGR nxn , (1.1) 

to be solved for X e R" xn ; here and in the following A T denotes the transpose of the 
matrix A. In particular, we focus on the decay and sparsity properties of the involved 
matrices that can be exploited for computational purposes, or that can give insight 
into the analysis of numerical solution methods. 

Matrix equations have always had a key role in Control theory, because their 
solution matrix carries information on the stability of the problem mm- More re¬ 
cently, linear matrix equations and their generalizations, linear “tensor” equations, 
have been shown to be an appropriate tool to represent the discretization of parame¬ 
terized partial differential equations, as they arise for instance in stochastic modeling; 
see, e.g., mmm and the discussion in [29} . Their analysis and numerical solu¬ 
tion is therefore attracting considerable attention in the numerical and engineering 
communities. 


Using the Kronecker product, the matrix equation (1.1) can be rewritten as the 
following standard (vector) linear system 


Ax = b, 


with 


A — I n ® A -)- A £$ In 


( 1 . 2 ) 


x = vec(X), b = vec(D), 

where the Kronecker product of two matrices A' and Y of size n x x m x and n y x m 9 , 
respectively, is defined as 


X<g)Y = 


Xu Y 

Xl 2 Y ■ ■ 

’ ^lrriA^ 

X21Y 

x 22 Y ■ ■ 

%2 rriA^ 

_Xn x l Y 

a nx 2Y ■ ■ 

' ^rixTrix^ m 


G C 


n x n y X m x m y . 


the vec operator stacks the columns of a matrix X = \x \,..., x m \ G C nxm one after 
the other as 


vec(A) = 


Xl 
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From (1.21 we can deduce that the system admits a solution for any b and this 
is unique, if and only if the matrix A is nonsingular. Using a standard result for the 
spectrum of Kronecker product matrices, this is equivalent to requiring that spec(H)n 
spec(— A) = 0, where spec(H) denotes the set of eigenvalues of A (see, e.g., (TUI 
Theorem 4.4.6]). 

Though the Kronecker form (1.21 of the problem is appealing, as a large body 
of literature on linear systems can be exploited, this approach dramatically increases 
the complexity of the computation, and also cannot preserve the intrinsic properties 
of the problem in practice; for instance, if D is symmetric, then the matrix X is also 
symmetric. This property is not preserved when solving (1.2), unless the system is 
solved at a very high accuracy. 

The numerical solution of 0 is particularly challenging when n is large. Indeed, 
although A and D may be sparse and/or structured, the solution matrix X is usually 
dense. For n = O(10 5 ) or higher, storing the full matrix X becomes prohibitive, 
and either sparse or low-rank approximations are sought. Experimental evidence and 
theoretical results indicate that low rank approximations can indeed be sought after 
whenever the right-hand side D is itself a low rank matrix. Major efforts in the past 
decades have been devoted to determining approximations of the lowest possible rank, 
given a fixed final accuracy. Interestingly, strategies that determine the most accurate 
approximation for a given rank have also been investigated. We refer to [25] for a 
recent survey on recent computational strategies. 

Sparsity and decay (quasi-sparsity) patterns of the solution matrix X have been 
less investigated. We aim to explore recent findings in this direction: while decay 
pattern results are proved by using the closed forms above and thus they seem to be 
of merely theoretical interest, they may be insightful in the development of computa¬ 
tional procedures and in the analysis of the expected solution properties. 

Decay properties have been classically investigated for the entries of the inverse 
of banded matrices. Let A = (a.ij), i,j = 1... ,n be a symmetric positive definite 
banded matrix with bandwidth /3, that is cq.j = 0 for |i — j\ > /3, and let k > 1 be 
the ratio between its largest and smallest eigenvalues. Then Demko et al. m showed 
that 


\A 1 \ij < C 0 q 3 J| (1.3) 

where q = (y^c — 1)/( v / k + 1) < 1, and Cq also depends on the extreme eigenvalues of 
A. This bound shows an exponential off-diagonal decay property as the inverse matrix 
entry moves away from the main diagonal, and the decay depends on the bandwidth 
of A. This property was generalized to functions of symmetric matrices in {? and 
more recently to more general matrices and abstract settings in mm- 

With completely different purposes (see [10]), in [IF it was shown that also the 
Lyapunov operator 


£ :X AX + 

enjoys similar properties. More precisely, for A symmetric and positive definite, the 
entries of the matrix £ _1 (ejej) show a rapid, although not necessarily exponential, 
decay pattern as one moves away from the element in position (i,j). Here we would 
like to linger over this property, exploring some of its possible consequences both the¬ 
oretically and computationally. The main aim of this paper is to discuss some recent 
results, and highlight directions of research that could improve our understanding of 
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the matrix properties of ( 1 . 1 ), and perhaps encourage the development of new com¬ 


putational strategies for its numerical solution when D is not low rank but possibly 
sparse. 

We will first generalize the decay pattern result to a wider class of matrices of 
interest in applications. Then we will show how to use this decay pattern to character¬ 
ize the approximate solution matrix of large Lyapunov equations by projection-type 
methods. Finally, we will make some comments on the solution pattern for sparse D. 

All reported experiments and plots were performed using Matlab [25]. 


2. Closed forms of the matrix solution. The solution X of (1.1) may be 


written in closed form in a number of different ways; here we report the forms that 
have been used in the literature: 

(a) Integral of resolvents. The following representation, due to Krein, exploits 
spectral theory arguments: (see, e.g., : 2T| ) 


1 1 °° 

X = — (uni - A)~ l D(uiI - A)~ E duj, 

2lT j -oo 


( 2 . 1 ) 


where I is the n x n identity matrix and Ti is a contour containing and 
sufficiently close to the spectrum of A. 

( b) Integral of exponentials. Assume that the field of value^jof A is all contained 
either in C _ or in C + , excluding the imaginary axis. This representation, due 
to Heinz, in Satz 5], is tightly connected to the previous one, 

/•OO 

X = / e At De tA ” dt, (2.2) 

Jo 

where e At is the matrix exponential of At. 

(c) Finite power sum. Assume D = BB 1 with B £ R nxs , s < n, and let a m 
of degree to be the minimal polynomial of A with respect to B , namely the 
smallest degree monic polynomial such that a m (A)B = 0. Then ( jl2j) 


m— 1 m —1 


X =E E^w H ) 


£=0 j =o 


= [H,AB,...,A m - 1 B]( 7®/) 


B E 

B E A E 


B E (A E ) m ~ 1 


(2.3) 


where 7 is the solution of the Lyapunov equation with coefficient matrices 
given by the companion matrix of a mi and right-hand side the matrix e\e\ 1 
where e\ = [1, 0,..., 0]; see also [23]. 

( d ) Similarity transformations. Strictly related to (c), in addition this form as¬ 
sumes that A can be diagonalized, U~ 1 AU = diag(Ai,..., A„). Let D = 
U~ 1 DU~ E . Then 


X = UXU 1 , with Xij 


Djj 

A i + p,j 


1 The field of values of A E R riXrl is defined as W{A) = {2 S C, z = x E Ax,x G C n ,x H x = 1}. 
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These closed forms are not used for computational purposes, since stable and 
efficient methods have been derived already in the 1970s [ 22 ] ■ Nonetheless, they can 
be used to describe other properties, such as entry decay - as we are going to show - 
and numerical low rank of the solution matrix |29j . 


3. Decay pattern properties of the solution. The sparsity pattern of the 
matrix D influences the pattern of the solution to (1.1), which can be qualitatively 
described a-priori. This pattern is typical of certain discretizations of elliptic partial 
differential equations. For instance, when discretizing the Poisson equation —A u = f 
on the unit square by means of finite differences or uniform low degree finite elements, 
the algebraic linear system Ax = b is obtained, where A = A<g>I + I<g>Ais called the 
stiffness matrix, and A is symmetric and tridiagonal, A = tridiag(—1, 2, —1); see, e.g., 
m section 6.3.3]. Clearly, this problem is equivalent to the Lyapunov equation (1.1) 
with x = vec(X) and b = vec (D). The particular pattern we are analyzing is derived 
whenever / is, say, a single-point forcing term. Analogously, the analysis of the decay 
pattern in X may give insight into the understanding of the sparsity pattern of the 
stiffness matrix A, since each column t of A -1 , A~ 1 et is nothing but the solution to 
Ax = e t HU. Our analysis helps formalize the oscillating - while decaying - pattern 
observed in the inverse stiffness matrix. 

Example 3.1. Let A = tridi ag(— 1, 2, —1) e R" xri , n = 10 and let t = 35, so that 


D = e tl e [ 2 , t\ = 5, t 2 = 4. Figure 3.1 shows the connection between the column A 1 e t 


(left plot) and the solution X (right plot on the mesh) of the corresponding problem 
AX + XA = e^e^. The plots illustrate the correspondence of each component of the 
vector as an entry of the matrix A', thus describing the oscillating behavior of the 
column A~ 1 et- Note that since A is also banded with bandwidth j3 = 10, the overall 
decay is justified by the estimate in (1.3), while the Kronecker structure is responsible 
for the oscillations. 


3.1. Decay pattern of the solution matrix. Assume the right-hand side 
consists of a very sparse matrix, as an extreme case, having a single nonzero element, 
so that 


AX + XA = £ t , £ t = e i 'ye], 77 ^ 0 . 

Without loss of generality, in the following we shall assume that 7 = 1 . Then using 
the integral closed form in ( 2 . 1 ) we can write 

1 r°° 1 r°° 

X = — / [iujI + A)~ 1 eie](iujl + A) -H dw =— / ZiZ E duj, (3.1) 
2tt J_ x 2tt J 


where Zi = ( iujI + A)~ 1 ei . Let S be the class oinxn ^-banded matrices S of the form 
S = a\I + a 2 «S'o with So = S “ (Hermitian) and aq, 02 £ C, and denote by [Ai, A 2 ] the 
line segment containing the eigenvalues of a matrix S 6 S. Note that if A £ S, also 
the matrix uni + A in (3.1) belongs to S. The set S includes real symmetric matrices 
(for oq = 0 and a = 1), but also complex skew-Hermitian matrices (a = 0 and a = i), 
and complex shifted matrices, as they arise, for instance, in the discretization of the 
Helmoltz equation. All matrices in § are normal matrices, that is for S £ S it holds 
that SS E = S E S. 


To be able to characterize the decay pattern in the solution X, we recall a result 
derived in 021 - 

Theorem 3.2. Let A £ §, a = (A 2 + Ai)/(A 2 — Ai) and R > 1 be defined as 
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Fig. 3.1. Column of X and corresponding element on the grid. 


R = a + \Ja 2 — 1, with a = (|Ai| + |A 2 1)/1A 2 — Ai|. Then 


2 R 

|Ai — A2I 


B(a) 



l<-i| 


\e\A 1 e i \ < 


> (■ 7 ^ i, 
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where, writing a = an cos (ip) + i[3r sin (if) 
B(a) := 


Ii 


cos 2 (if) (a r + \Jol\- cos 2 (if)) ’ 

with a R = |(i? + ^) and Pr = \(R — ^)- 

With this result, the following bound for the entries of the Lyapunov solution 
matrix can be obtained; in spite of the more general setting, the proof proceeds as in 
m and it is therefore omitted. 

Theorem 3.3. Foruj £ R, let uni + A £ S, with eigenvalues contained in the line 
segment [Ai,A 2 ] := [A m i n (A) + coi, A max (T) + un]. With the notation of Theorem 3.2 
and for k = (k±, /c 2 ) and t = (ti, ^ 2 )> ki , U £ { 1 ,..., n}, i = 1 , 2 , the following holds, 
i) If t\ 7 ^ k\ and t 2 7 ^ k 2 , then 


IPOfci.fcal < 


1 


64 


2 ?r |A 2 - Ai | 2 J _ 00 

ii) If either t\ = k\ or t 2 = fc 2 , then 


R 2 


(R 2 - l ) 5 


2 / -^ \ |*i— ki\/0+\t 2 —k 2 \/0 —2 


R 


do;; 


\{X) klM \ < 


1 


1 


R 2 


2 7T |A 2 - Ail v/IAiP+w 2 (R 2 ~ l) 2 

in) If both i = t and j = m, then 


Y \ \ti — ki\/0+\t 2 —k 2 \/0—l 


R 


do;; 


1 1 

\{X) klM \ < —J^ |Ai| 2 +w 2 d W 


2 |Ai|' 


Example 3.4. We consider the n x n nonsymmetric tridiagonal matrix M = 
tridiag(l, 2, —1), n = 100, and D = bb T with b = e. 50 ; note that the same pattern 
is obtained with its complex counterpart M = tridiag(i, 2, i) £ S. Figure 3.2 shows 
the solution pattern in linear (left) and logarithmic (right) scale. The solution entries 
decay very quickly away from the entry (50,50). This fact can be appreciated by 
noticing the 2 -axis in the logarithmic scale. 



Fig. 3.2. Decay pattern of X for Example \ ?-4\ Left: linear scale. Right: logarithmic scale. 

In cases of interest in applications, such as in the case of problems stemming from 
certain discretizations of elliptic partial differential equations, the matrix A itself may 
have a structure of sum of Kronecker products, that is 


A = M®I + I®M , 


(3.2) 
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where M is y/n x y/n and banded. We use once again (3.1) for expressing e^Ie^, by 
first assuming that a lexicographic order was used to generate the indexes k±, t\ 


on 


the two-dimensional grid, so that we can identify k\ = (fen, ^ 12 ), t\ = (in, in)- Then 


we can see that the inner product inside the integral satisfies ej. (itul + A) 1 et 1 = 
e\ Z tl ek 12 where Z tl solves the Lyapunov equation 


M + -uni 


Z Z ( Ad —coil J — e tll e. 


Therefore, in a recursive manner, the decay of the entries in the matrix Z can be 
described in terms of Theorem |3.3| In practice, this pattern results in a “local” 
oscillation associated with the decay of the columns of Z, and a “global” oscillation, 
associated with the decay of X. 

Example 3.5. A typical sample of this behavior is shown in Figure [373] where 
M = tridiag(—1, 2, —1) £ yfn = 10, so that A £ R nxra , and B = eso- 



Fig. 3.3. Pattern of solution X with A in 1(3.2]) . Left: linear scale. Middle: logarithmic scale. 
Right: 48/A column of X (slice of left plot in linear scale). 


Combining the Kronecker structure of A with the one in (3.2) of the Lyapunov 
equation, we can reformulate the problem as 

(M®/®/®I + J®M®/®/ + I®J®M®/-|-I®J®I® Ad)x = 6, 

where b = vec (BB T ), and all identity matrices have dimension y/n. This form reveals 
the actual 4x4 tensorial nature of the solution A' in case A has the described Kronecker 
structure. Computational methods to solve for x while using only information in R''/™ 
together with the tensor structure have been proposed; see, e.g., [22] ■ Here we just 
want to emphasize that the quality of the decay in the entries of X, and thus of x , 
is all determined by that of the banded matrix M, while the location of this decay 
pattern is ruled by the Kronecker form of A, as nicely visible in the middle plot of 
Figure [331 


3.2. Numerical low rank properties of the solution matrix. In the litera¬ 


ture there is very large experimental evidence that if the right-hand side in (1.11 has 


low rank, then the eigenvalues of the solution X decay very rapidly. In some cases, 
such as for A symmetric and positive definite, this behavior has been proved [26j . 
Bounds that cope with non-normality have also been derived, see, e.g., [23 sec.3.1.2] 
and ^]. Theoretical results showing the fast decay of the spectrum rely on the low 
rank of the matrix D , and on the spectral properties of the coefficient matrix A. We 
are not aware of results that combine the truncation strategies associated with the 
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matrix decay pattern with the low rank property. Both strategies aim at drastically 
reducing the memory requirements of the solution matrix, and in general, sparsity- 
driven truncated matrices may have larger numerical rank than the original matrices. 
However, the following example provides an interesting setting, which will be worth 
exploring in the future. 

Example 3.6. Let A = tridiag(— 1,4, — 1) £ R" xn , n = 100 and D = BB 1 
with B = [e 50 , ..., e 60 ]. The left plot of Figure [ih4| shows the solution pattern. The 
symmetric and positive semidefinite matrix X has 9724 nonzero elements. Moreover, 
25 eigenvalues of A' are above 10 -14 , therefore X can be very precisely reproduced 
as X = XiXf with a tall matrix Xi having rank (and thus number of columns) as 
low as 25. The right plot shows the sparsity pattern of the matrix X obtained by 
zeroing all entries of X below 10 -5 (the largest entry of X is 0(1))- The matrix X 
has preci sely 19 nonzero singular values, and only 219 nonzero entries (see right plot 
of Figure 3.4). The accuracy of X is not as good as with the rank-25 matrix, since 
||A — X\\ ~ 10 -5 , however for most applications this approximation will be rather 
satisfactory. Though this example is not sufficiently general to make a general case, 
this phenomenon deserves further analysis. 



Fig. 3.4. Solution X for Example \3. 6j Left: 
9724■ Right: Sparsity pattern of truncated version 


pattern of X with logarithmic scale, imz(X) = 
of X: all entries below Hr 5 are omitted. 


4. Numerical considerations. Sparsity and quasi-sparsity properties are now 
starting to be considered in the numerical approximation of the solution to (1.1). In 


the following we shall discuss two situations where the analysis of the previous sections 
can provide insights into the understanding and development of numerical strategies. 

4.1. Decay issues in the approximate solution by projection. Let A be 


n x n and sparse, D = BB T with B £ 


:s , s < n. An effective strategy for 
approximately solving (1.1) consists of determining a good approximation space K m 
of small dimension to, and then generating X m = V m YV.Z ~ X where the orthonormal 
columns of V m £ R nxm span K m . The matrix Y £ R mxm is determined by imposing 
some extra condition. For instance, a possible strategy requires that the residual 
R m := AX m + X m A T — BB T satisfies the following (orthogonality) condition: 


Inserting the expression for the residual, and recalling that VZ V m = I m , this condition 
leads to the following reduced Lyapunov equation 

(V*AV m )Y + Y(VZA T V m ) = VZBB T V m . 


(4.1) 
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Setting T m = Vj n AV m and B m = V^B, the matrix Y is thus determined by solving 
the equation T rn Y + YTj n = of (small) size m; this can be performed by using a 

“dense” method, such as the popular Bartels-Stewart algorithm, whose computational 
cost is 0(m 3 ) [l]. The effectiveness of the whole approximation thus depends on how 
small to can be and still obtain a satisfactory approximation. 

The choice of K m is crucial to obtain a rich enough approximation space while 
keeping its dimension to small. We refer the reader to the recent survey [29] for a 
comprehensive description of these issues and the relevant recent literature. Here we 
consider the simplest possible choice of approximation space in the class of Krylov 
subspace methods: assuming for ease of presentation B = b € R", we define the 
(standard) Krylov subspace as the vector space 

K m = K m (A, b) := span{6, Ab ,..., A m ~ 1 b}. 

As m increases, K m+ i is obtained from K m by one additional multiplication by A 
of the last basis vector. An orthonormal basis can be progressively constructed as 
to increases by using the Arnoldi procedure m , giving rise to the columns of V m ; 
in particular, b = V^ei||6|| - The Arnoldi recurrence also establishes the following 
relation 


AV m — T v m +it m +ie, 


T 

mi 


(4.2) 


where v m +i is the next vector of the basis, and t m +i is computed during the recur¬ 
rence. 

A nice feature of the Arnoldi procedure, is that the reduced matrix T m = V^AVm 


is upper Hessenberg, so that the cost of solving the reduced matrix equation in (4.11 


is lower than for a full matrix. For A symmetric, the Arnoldi procedure reduces to the 
Lanczos procedure, and most importantly in our context, T m is also symmetric and 
thus tridiagonal. With these hypotheses, the sparsity considerations of the previous 


sections become relevant. Indeed, the solution Y to (4.1) can be written in closed 
form as 

i r°° _ 

Y= 2n lI ~ r m)‘* e ill ft ll e i(w*J - T m )~ n du, BB 1 = ( ei ||6||)( ei ||6||) T . 


It is also important to keep in mind that a new Y will be constructed each time the 
Krylov subspace is enlarged by one, since T m will be extended to T m+1 by the addition 
of one row and one column. 

The entries of Y can be bounded as described in Theorem |3.3[ where however 
the right-hand side this time has a fixed nonzero entry, the one corresponding to the 
(1,1) position. Thus, 


Y = Y — 


1 

27T 


/ OO 

ej(unl - Tm) -1 e-ipW 2 e\(unl - T m y n ejduj, (4.3) 

-OO 


Since T m is tridiagonal, and thus banded with bandwidth (3 = 1, the quan tities 
|ej (uni — T m ) _1 ei| undergo the exponential decay described in Theorem 3.2 with 


the rate (1/I?)I* _1 L Here R is associated with the spectral properties of T m . We 
recall here that due to the Courant-Fisher min-max theorem, the eigenvalues of T m 
tend to approximate the eigenvalues of A when A is symmetric, therefore the spectral 
properties of T m will eventually be associated with those of A , as to grows. 
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Remark 4.1. Our presentation is based on (3 which exploits decay properties 
of the inverses of shifted matrices described in Theorem \ 3.2\ Qualitatively similar 
results could be obtained by using the exponential closed form in (£.£$ , by using very 
recently developed bounds for the entries of the exponential matrix 


It is important to realize that the presence of a fixed column index ei in (4.31 
provides additional structure to the decay pattern. For instance, all diagonal entries 
of Y have a decreasing pattern, as 


12 r °o 

/ \e[{u>il — T m )~ 1 ei 

J — OO 


Yii = 


27r 


|2 II L||2 


do;, 


so that, using Theorem 3.3 

IIZJI2 


I (50<,il < 


64 

|A max A m j n p J—oo 


R 2 

(.R 2 - l ) 2 


2 / 1 \ 2\i—l\—2 

R 


dw, i > 1; 


where R is as described in Theorem |3.2[ with T m real symmetric. More explicit 
estimates can be obtained by bounding the integral from above, as done, e.g., in E3; 
we avoid these technicalities in this presentation. 

The bound above allows us to distinguish between the decay of the entries of \Y\ 
and that of the entries of IT” 1 ], since in the latter case, the diagonal entries do not 
necessarily decay. Moreover, the integrand for the nondiagonal entry Y)_ 7 , j ^ i will 
contain the term 


1 \ |i-l|+|j-l|-2 

r) • « >1 ’ 

illustrating a superexponential decay of the antidiagonals, as i grows. 

Example 4.2. In Figure |4T] a typical pattern is shown for Y : for this example, 
A £ ]R nxn , n = 900 is the finite difference discretization of the two-dimensional 
Laplace operator in the unit square with homogeneous boundary conditions, and 
B = b is taken to be a vector with random uniformly distributed values in the interval 
(0,1). A standard Krylov subspace of dimension m = 30 was considered, so that Y 
is a real symmetric 30 x 30 matrix. 



Fig. 4.1. Decay pattern of solution Y of the reduced Lyapunov equation B A e R"™, 
n = 900 is the discretization of the two-dimensional Laplacian in the unit square, b is a random 
vector and m = 30. 


The analysis of the decay pattern of Y appears to be new. On the other hand, it 
is well known that the last row (or column) of Y carries information on the accuracy 
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of the solution X m . Indeed, let ||I?|| be the Frobenius (|| • ||f) or induced 2-norm 
(II ' II 2 II) °f the matrix R. Then using (4.2) the residual norm satisfies (see pil l 

\\R\\ = | \AV m YV£ + V m YVZA T - V m e4b\\ 2 e\Vl\\ 

= || V m T m YVl + V m YT^Vl - V mei r i|2 " TT ' T 


= IIK 


ra+1 


e l v m 

T m Y + V m YT T - erlHlMV* t m+1 e T m Y 

tm+l 

tm+ie m V 


0 


ChI 


0 

r G-m tm+ 1 


0 


where in the last equality the orthogonality of the columns of V m 
used. Hence, 

» = | ^m+1 ||pe m || 2 , \\R\\ F = V2\t m+1 \\\Ye m \U 


and (4.1) were 


Therefore, if the last column of Y, Ye m , is small in norm then the residual will be 
small. We note that on the other hand, the possibility that |t m +i| is small is related 
to the fact that an invariant subspace is found while generating the space K m via the 
Arnoldi recurrence (4.2), which is in general a less likely event, for general b and small 
to compared to n. 

Due to the previous discussion on the entries \Y h j\. for j = m the vector ||Ve m || is 
expected to be small for to large enough (we recall here that Y changes and increases 
its dimension as to increases). For to > 3 the norm is bounded as 



Usi »/£(W = ( 1 -T )/( 1 _J_) = _£ 


k -0 


R R m - 1 


R R - 1 R: n 


, after some algebra we obtain 


||Ve m || < 


HI 5 


64 


R 8 


1 R m - 1 


27t |A max - A min | 2 (R - 1 )(R 2 - l) 4 R™ R" 


dw 


< 


l&ll S 


64 


R 8 


27T |A max - A min | 2 J_ x (R - l)(i? 2 - l) 4 R' 


-do; 


where we used that R R < 1. More explicit estimates could be obtained by further 
bouding the integrand in terms of the variable uj. 

The relation between the convergence rate, in terms of the residual norm, and 
the solution sparsity pattern should not come as a surprise. Indeed, the estimate in 
Theorem |3.2| exploits the same polynomial approximation properties that are used 
to prove convergence of standard Krylov subspace methods to solve the Lyapunov 
equation by projection 3U . 
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4.2. Numerical solution for a sparse right-hand side. Projection type 
methods are suitable when the right-hand side matrix D has low rank, as discussed 
in the previous sections. Sparse but larger rank right-hand sides can also enjoy par¬ 
ticular properties, and may also be viewed as sum of low rank matrices. For instance, 
assume that A is symmetric positive definite and banded. Due to the linearity of the 
matrix equation, the solution X to 


AX + XA = D, D = diag(<Si, ..., S n ), 


(4.4) 


can be split as X = X 1 + ■ ■ ■ + X n , where for all j G {1,... n} such that Sj ^ 0 the 
addend matrix Xj is the solution to the corresponding equation 


AX + XA = ejSje]; 


see, e.g., jJU] for a similar linearity strategy. It follows from our discussion on decay 
that each Xj will have an a-priori detectable decay pattern - a peak corresponding to 
the (j,j) entry - from which the decay pattern of the whole of X can be predicted; 
see, e.g., m for a similar discussion on the sparsity property of the resulting solution. 
We remark that our decay analysis suggests that each Xj may be truncated to main¬ 
tain sparsity in the approximate solution Xj, so as to obtain a good enough spa rse 
approximate solution X. We next report a typical example; see also Example 3.6 
Example 4.3. For A equal to the Laplace operator as before, the left plot of 


Figure 4.2 reports the solution to (4.4) when D is a nonsingular diagonal matrix with 


random entries uniformly distributed in the interval (0,1); the right plot corresponds 
to D = diag(5i,..., S n ) with Sj equal to a random value as above, but only for 
j = 50,..., 70. The different sparsity patterns of the two solutions is as expected 
from the theory. In particular, the left plot shows slightly larger values of the diagonal 
entries corresponding to the central part of the diagonal, where all linear components 
Xj contribute. On the other hand, the right plot confirms that (diagonal) peaks can 
only occur in correspondence with the nonzero diagonal entries of D. 



Fig. 4.2. Decay pattern of solution X of the Lyapunov equation with right-hand side a diagonal 
matrix D. Left: D nonsingular. Right: nonzero elements in D only corresponding to the diagonal 
entries from 50 to 70. A different viewpoint is used in the two plots. 


5. Further considerations. Our presentation was aimed at highlighting some 
decay properties of the matrices involved in the solution of the Lyapunov equation, 
that may give insight into the development of new numerical methods. In particular, 
the solution of the matrix equation when the right-hand side matrix D has large or 
even full rank remains a big challenge. Exploiting the possible sparsity of D may 
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provide a good solution strategy; we refer the reader to m and its references for 
recent developments in this direction. 

For ease of presentation, our analysis was limited to the Lyapunov equation. Very 
similar results can be obtained for the Sylvester linear equation A\X + XA 2 = D , 
where Ai, A -2 may be of different dimensions; indeed, the solution can be written with 
a closed form similar to that in (2.1|, so that corresponding decay bounds can be 
obtained; see [25J. 

Most of our results can be generalized to the case of A nonnormal and diagonal- 
izable; this can be performed by replacing Theorem 3.2 with corresponding results for 
functions of nonsymmetric matrices, developed for instance in |8L 
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